of A Data Locality Optimizing Algorithm
Abstract
What problem did the paper address? The big-picture problem is how to improve program performance given the large latency between the processor and memory. One approach used in the past is a loop transformation called tiling. The paper addresses the problem that not all loop nests are initially tileable. Specifically, the authors answer the question: what combination of loop permutation, skewing, and reversal (unimodular transformations) should be applied to make a loop nest tileable in a way that improves data locality? The problem is hard: just finding a legal set of unimodular transformations is exponential in the number of loops [SD90].
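To make the legality question concrete, here is a minimal sketch. It assumes dependences are given as constant distance vectors (direction vectors are not handled), and the helper names `apply`, `is_legal`, and `is_fully_permutable` are illustrative rather than taken from the paper. It treats a transformation as legal when every transformed dependence vector stays lexicographically positive, and treats the nest as tileable when, in addition, every component is non-negative (the usual full-permutability condition).

```python
from typing import Sequence, Tuple

Vector = Sequence[int]
Matrix = Sequence[Sequence[int]]


def apply(T: Matrix, d: Vector) -> Tuple[int, ...]:
    """Return the transformed dependence vector T * d."""
    return tuple(sum(row[j] * d[j] for j in range(len(d))) for row in T)


def lex_positive(d: Vector) -> bool:
    """True if the first nonzero component of d is positive."""
    for x in d:
        if x > 0:
            return True
        if x < 0:
            return False
    return False  # the all-zero vector is not lexicographically positive


def is_legal(T: Matrix, deps: Sequence[Vector]) -> bool:
    """Legal if every transformed dependence is lexicographically positive."""
    return all(lex_positive(apply(T, d)) for d in deps)


def is_fully_permutable(T: Matrix, deps: Sequence[Vector]) -> bool:
    """Legal, and every transformed component is non-negative (tileable)."""
    return is_legal(T, deps) and all(
        x >= 0 for d in deps for x in apply(T, d)
    )


# Hypothetical 2-deep nest (i, j) with three distance vectors.
DEPS = [(0, 1), (1, 0), (1, -1)]

IDENTITY = [[1, 0], [0, 1]]      # leave the nest unchanged
INTERCHANGE = [[0, 1], [1, 0]]   # permutation: swap i and j
SKEW = [[1, 0], [1, 1]]          # skewing: new inner index is i + j
```

With these assumed dependences, the original nest is legal but not tileable because of the `(1, -1)` dependence, interchange alone is illegal, and skewing the inner loop by the outer one maps the dependences to `(0, 1)`, `(1, 1)`, and `(1, 0)`, which makes the nest fully permutable.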
Similar resources
A Data Layout Optimization Technique Based on Hyperplanes
This paper presents a data layout optimization technique based on hyperplane theory from linear algebra. Given a program, our framework automatically determines the optimal layouts that can be expressed by hyperplanes for each array that is referenced. We discuss the cases where data transformations are preferable to loop transformations and show that under certain conditions a loop nest can be...
A New Algorithm for Global Optimization for Parallelism and Locality
Converting sequential programs to execute on parallel computers is difficult because of the need to globally optimize for both parallelism and data locality. The choice of which loop nests to parallelize, and how, drastically affects data locality. Similarly, data distribution directives, such as DISTRIBUTE in High Performance Fortran (HPF), affect available parallelism and locality. What is nee...
Improving Cache Locality by a Combination of Loop and Data Transformations
Exploiting locality of reference is key to realizing high levels of performance on modern processors. This paper describes a compiler algorithm for optimizing cache locality in scientific codes on uniprocessor and multiprocessor machines. A distinctive characteristic of our algorithm is that it considers loop and data layout transformations in a unified framework. Our approach is very effectiv...
A Locality Optimizing Algorithm for Developing Stream Programs in Imagine
In this paper, we explore a novel locality optimizing algorithm for developing stream programs in Imagine to sustain high computational ability. Our specific contributions include that we formulate the relationship between streams and kernels as a Data&Computation Matrix (D&C Matrix), and present the key techniques for locality enhancement based on this matrix. The experimental results on five ...
Improving Cache Locality by a Combination of Loop and Data Transformation
Exploiting locality of reference is key to realizing high levels of performance on modern processors. This paper describes a compiler algorithm for optimizing cache locality in scientific codes on uniprocessor and multiprocessor machines. A distinctive characteristic of our algorithm is that it considers loop and data layout transformations in a unified framework. Our approach is very effective...
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts
This paper presents a data layout optimization technique for sequential and parallel programs based on the theory of hyperplanes from linear algebra. Given a program, our framework automatically determines suitable memory layouts that can be expressed by hyperplanes for each array that is referenced. We discuss the cases where data transformations are preferable to loop transformations and show...